Abstract. We prove the following extension of the Wiener-Wintner Theorem and the
Carleson Theorem on pointwise convergence of Fourier series: For all measure preserving
flows $(X,\mu,T_t)$ and $f \in L^p(X,\mu)$, there is a set $X_f \subset X$ of probability one, so that for all
$x \in X_f$ we have

$$\lim_{s\downarrow 0} \int_{s<|t|<1/s} e^{i\theta t} f(T_t x)\,\frac{dt}{t} \quad \text{exists for all } \theta.$$

The proof is by way of establishing an appropriate oscillation inequality which is itself an 
extension of Carleson's theorem. 



1. The Main Theorem 

We are concerned with quantitative inequalities related to the pointwise convergence of 
singular integrals that are uniform with respect to modulation. To state our results, define 
dilation and modulation operators by 

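(1.1) $\operatorname{Dil}^{(p)}_s f(x) := s^{-1/p} f(x/s)$, $\quad 0 < s, p < \infty$,

$\operatorname{Dil}^{(\infty)}_s f(x) := f(x/s)$, $\quad 0 < s < \infty$.

(1.2) $\operatorname{Mod}_\xi f(x) := e^{i\xi x} f(x)$, $\quad \xi \in \mathbb{R}$.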
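As a quick numerical illustration of the normalizations in (1.1) and (1.2), the sketch below checks on a sampled Gaussian that the $L^p$-normalized dilation preserves the $L^p$ norm and that modulation preserves the modulus. The helper names `dil`, `mod` and `lp_norm`, the test function, and the grid are assumptions made for this example only.

```python
import numpy as np

def dil(f, s, p):
    """Return Dil_s^{(p)} f as a callable: x -> s**(-1/p) * f(x/s)."""
    scale = 1.0 if np.isinf(p) else s ** (-1.0 / p)
    return lambda x: scale * f(x / s)

def mod(f, xi):
    """Return Mod_xi f as a callable: x -> exp(i*xi*x) * f(x)."""
    return lambda x: np.exp(1j * xi * x) * f(x)

def lp_norm(f, p, grid):
    """Riemann-sum approximation of the L^p norm of f on a uniform grid."""
    dx = grid[1] - grid[0]
    return (np.sum(np.abs(f(grid)) ** p) * dx) ** (1.0 / p)

gauss = lambda x: np.exp(-x ** 2)          # hypothetical test function
grid = np.linspace(-60.0, 60.0, 240001)    # wide, fine grid (an assumption)

norm_before = lp_norm(gauss, 3.0, grid)
norm_after = lp_norm(dil(gauss, 4.0, 3.0), 3.0, grid)
mod_preserves_modulus = np.allclose(np.abs(mod(gauss, 2.5)(grid)), np.abs(gauss(grid)))
```

The two norms agree up to discretization error, which is the reason for the factor $s^{-1/p}$ in (1.1).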

Let $K$ be a distribution. The most important example will be $K_H(y) = y^{-1}\zeta(y)$, where $\zeta$
is a smooth, symmetric, compactly supported function. This is a distribution associated to
a truncation of the Hilbert transform kernel.

Our principal concern is the convergence of the terms $(\operatorname{Dil}^{(1)}_s K) * f(x)$ in a pointwise sense,
and in one that is, in addition, uniform over all modulations. To do this, we use the following
definition.

(1.3) $$\operatorname{Osc}_n(K;f)^2 \stackrel{\mathrm{def}}{=} \sum_{j=1}^{\infty} \sup_{k_j \le l < l' \le k_{j+1}} \bigl| \bigl[ (\operatorname{Dil}^{(1)}_{2^{l/n}} K) - (\operatorname{Dil}^{(1)}_{2^{l'/n}} K) \bigr] * f \bigr|^2 .$$

This definition depends upon a choice of an increasing sequence of integers $k_j \in \mathbb{Z}$, a
dependence that we suppress as relevant constants are independent of the choice of $\{k_j\}$. It also
depends upon a choice of positive integer $n$, which we have incorporated into the notation.
This only permits dilations of the form $2^{l/n}$ for integers $l$.
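To make the shape of definition (1.3) concrete, here is a minimal sketch of the same functional for a finite sequence of numbers. It assumes that `values[l]` stands in for the value of $(\operatorname{Dil}^{(1)}_{2^{l/n}} K) * f$ at one fixed point and that `breakpoints` is the sequence $k_1 < k_2 < \dots$; the function name and the convention that both endpoints of a block are included are assumptions of this example.

```python
import math

def oscillation(values, breakpoints):
    """Discrete oscillation of a finite sequence.

    Returns sqrt( sum_j  max |values[l] - values[l']|^2 ), where for each j the
    maximum runs over all l, l' with breakpoints[j] <= l, l' <= breakpoints[j+1].
    """
    total = 0.0
    for k_lo, k_hi in zip(breakpoints, breakpoints[1:]):
        block = values[k_lo:k_hi + 1]
        # largest pairwise difference inside the block
        biggest = max(abs(a - b) for a in block for b in block)
        total += biggest ** 2
    return math.sqrt(total)

linear = oscillation(list(range(6)), [0, 2, 5])   # blocks [0,1,2] and [2,3,4,5]
constant = oscillation([7.0] * 6, [0, 2, 5])      # no variation inside any block
```

A sequence that converges has small oscillation over late blocks, which is why a bound on the oscillation that is uniform in the choice of breakpoints yields convergence.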

1.4. Theorem. Fix a smooth, symmetric, compactly supported function $\zeta$. For integers $n > 0$
and $1 < p < \infty$ there is a constant $C_{n,p,\zeta}$ so that we have the inequality

(1.5) $$\bigl\| \sup_N \operatorname{Osc}_n(K_H; \operatorname{Mod}_N f) \bigr\|_p \le C_{n,p,\zeta} \|f\|_p .$$

The inequality holds for all choices of increasing sequences $\{k_j : j \ge 1\}$ satisfying
$k_{j+1} \ge k_j + n$.



Our primary interest in this theorem is the corollary below, which is a Hilbert transform 
counterpart to the well known Wiener-Wintner theorem for ergodic averages. Deriving the 
corollary below is a standard part of the literature, with the roots of the argument going back 
to Calderón [8]. The use of an oscillation inequality to establish convergence was introduced
by Bourgain [7]. Also see the papers of Campbell et al. [10], and Jones et al. [14]. 

1.6. Corollary. For all measure preserving flows $\{T_t : t \in \mathbb{R}\}$ on a probability space $(X,\mu)$
and functions $f \in L^p(\mu)$, there is a set $X_f \subset X$ of probability one, so that for all $x \in X_f$ we
have

$$\lim_{s\downarrow 0} \int_{s<|t|<1/s} e^{i\theta t} f(T_t x)\,\frac{dt}{t} \quad \text{exists for all } \theta.$$



This is a common extension of two classical theorems: Carleson's Theorem [11] on Fourier
series with Hunt's extension [13], and the Wiener-Wintner Theorem [22] on ergodic averages. 



Carleson's Theorem. We have the inequality

$$\Bigl\| \sup_N \Bigl| \int \operatorname{Mod}_N f(x-y)\,\frac{dy}{y} \Bigr| \Bigr\|_p \lesssim \|f\|_p, \qquad 1 < p < \infty.$$



Wiener-Wintner Theorem. For all measure preserving flows $\{T_t : t \in \mathbb{R}\}$ on a probability
space $(X,\mu)$ and functions $f \in L^p(X,\mu)$, there is a set $X_f \subset X$ of probability one, so that
for all $x \in X_f$ we have

$$\lim_{s\to\infty} s^{-1} \int_0^s e^{i\theta t} f(T_t x)\,dt \quad \text{exists for all } \theta,$$

$$\lim_{s\to 0} s^{-1} \int_0^s e^{i\theta t} f(T_t x)\,dt \quad \text{exists for all } \theta.$$



The Wiener-Wintner Theorem can be seen as an extension of the Birkhoff Ergodic
Theorem. The Carleson Theorem is a deep result from the 1960s, and since then several proofs
have been offered. An extensive survey and bibliography on this subject can be found in [15].

The possibility of extending the Wiener-Wintner Theorem to the setting of the Hilbert
transform was first raised in the paper of Campbell and Petersen [9]. The specific result
proved there was essentially Carleson's Theorem on the integers, with a transference to
measure preserving systems. Part of this was contained in a prior work of Máté [18], a work
that was overlooked until much later.

Assani [1,2] proved our Corollary 1.6 on different classes of dynamical systems. Indeed, 
he formulated the concept of a Wiener-Wintner system. In this nomenclature, our corollary 
states that all measure preserving systems are Wiener-Wintner systems. 

Our tool to prove convergence in the Hilbert transform setting is the oscillation inequality 
(1.5), an idea first employed in ergodic theory in the pioneering work of Bourgain on the 
ergodic theorem along arithmetic sequences [7]. The use of oscillation has subsequently been 
systematically studied in e.g. [10, 14] and in references therein. 

The main goal of the paper is a proof of Theorem 1.4. Clearly, we follow the lines of a 
proof of Carleson's Theorem. In particular we employ the Lacey-Thiele approach [17] and 
refine one part of it to deduce our main theorem. We will also appeal to the 'restricted weak 
type argument' of C. Muscalu, T. Tao, and C. Thiele [20] and L. Grafakos, T. Tao, and 
E. Terwilleger [12]. 

Acknowledgment. The authors have benefited from conversations with Jim Campbell,
Anthony Quas and Máté Wierdl. Part of this research was completed at the Schrödinger
Institute, Vienna, Austria. For one of us (ML), discussions with Karl Petersen about this question
formed our introduction to Carleson's theorem, for which we have been indebted to him ever
since.



2. Deduction of Theorem 1.4 

There are two more technical estimates that we prove. Specifically, let $\psi$ be some Schwartz
function which satisfies

(2.1) $0 \le \widehat{\psi}(\xi) \le C_1$,

(2.2) $\widehat{\psi}$ is supported in $[-2, -\tfrac12]$,

(2.3) $|\psi(y)| \le C_1 \min(|y|^{-\nu}, |y|^{\nu})$.

Here, $\nu$ will be a large constant whose exact value we need not specify. We will not have
complete freedom in precisely which Schwartz function $\psi$ we can take here. It should arise in
a particular way described in the proof of Proposition 2.4, and will be nonzero. The purpose
of this section is to describe how a particular result for any choice of nonzero $\psi$ as above
will lead to a proof of our main theorem.

Consider the distribution

oo 
u=l 

We will prove the following two propositions in the next section. 

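2.4. Proposition. With the assumptions (2.2)-(2.3), the inequality (1.5) holds with $n = 1$
and the distribution $K_H$ replaced by $\Psi$.

2.5. Proposition. We have the inequality

(2.6) $$\Bigl\| \sup_N \Bigl[ \sum_{v=-\infty}^{\infty} \bigl| \{\operatorname{Dil}^{(1)}_{2^v} \psi\} * \operatorname{Mod}_N(f) \bigr|^2 \Bigr]^{1/2} \Bigr\|_p \lesssim \|f\|_p, \qquad 1 < p < \infty.$$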



Note that for fixed modulation, (2.6) is a Littlewood-Paley inequality, making the inequality
above a "Carlesonized Littlewood-Paley" inequality. Inequalities like this have been proved
by Prestini and Sjölin [21]. They also follow from the method of Lacey and Thiele.

Both propositions follow from our Proposition 3.9 of the next section, which is phrased in 
a language conducive to the methods of Lacey and Thiele [17]. These methods have been 
applied in a number of variants of Carleson's theorem, see e.g. Pramanik and Terwilleger 
[19] and Grafakos, Tao and Terwilleger [12]. 

We turn to the deduction of Theorem 1.4. Observe that the two previous propositions
immediately prove that when we consider dilations which are powers of $2^{1/n}$ we have

$$\bigl\| \sup_N \operatorname{Osc}_n(\Psi; \operatorname{Mod}_N f) \bigr\|_p \lesssim n \|f\|_p, \qquad n \in \mathbb{N},\ 1 < p < \infty.$$

Thus we need not concern ourselves with this feature of Theorem 1.4 and Corollary 1.6.
For a distribution $K$, set

$$\|K\|_{*,p} := \sup_{\|f\|_p = 1} \bigl\| \sup_N \operatorname{Osc}_1(K; \operatorname{Mod}_N f) \bigr\|_p .$$

Note that since our definition incorporates differences, this is a seminorm on distributions
$K$. That is, it obeys the triangle inequality (which we use), but can be zero for nonzero
distributions. In particular, for a Dirac point mass $\delta$ we have $\|\delta\|_{*,p} = 0$, and similarly for
the distribution $K$ with $\widehat{K} = \mathbf{1}_{[0,\infty)}$.

Our task is to show that $\|K_H\|_{*,p} < \infty$, where $K_H(y) = y^{-1}\zeta(y)$ for some smooth, symmetric,
compactly supported Schwartz function $\zeta$. Our Proposition 2.4 is, with this notation, the
assertion that $\|\Psi\|_{*,p} < \infty$. The same inequality will hold for a kernel which can be obtained
as a convex combination of dilations of $\Psi$ and $\overline{\Psi}$. Thus, set

In this integral, we are careful to integrate against the measure $ds/s$, which is the Haar measure
for the positive reals under multiplication, the underlying group for the dilation operators. In
particular, it follows that $\Psi_0$ is a distribution whose Fourier transform is a nonzero constant
on $(-\infty, -1)$ and is $0$ on $(-\tfrac12, \infty)$. Thus by Proposition 2.4, we clearly have $\|\Psi_0\|_{*,p} < \infty$.

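Now we will show that $\|D_0\|_{*,p} < \infty$ for the distribution

$$D_0(y) = y^{-1}\zeta(y) - c\,\bigl(\Psi_0(y) - \overline{\Psi_0}(y)\bigr),$$

where we choose the complex constant $c$ so that $\lim_{\xi\to\infty} \widehat{D_0}(\xi) = 0$. In fact, it is a well known
elementary fact that for $c = i\pi$,

(2.7) $$\int \zeta(y) e^{i\xi y}\,\frac{dy}{y} = c + O(|\xi|^{-1}).$$

We will decompose the distribution $D_0$ into a sum which can be treated with Proposition 2.5.
Then using that $\|\Psi_0\|_{*,p} < \infty$ and $\|D_0\|_{*,p} < \infty$, we obtain the desired inequality for
$K(y) = y^{-1}\zeta(y)$.

Choose $\chi$ to be a smooth function supported on $\tfrac12 < |\xi| < 2$ so that

$$\sum_{k=-\infty}^{\infty} \operatorname{Dil}^{(\infty)}_{2^k} \chi = \mathbf{1}_{\mathbb{R}-\{0\}},$$

and set $\widehat{\Delta_k} = \widehat{D_0}\,\operatorname{Dil}^{(\infty)}_{2^{-k}} \chi$. The following lemma finishes the proof of Theorem 1.4.

2.8. Lemma. We have

$$\|\Delta_k\|_{*,p} \lesssim 2^{-|k|}, \qquad k \in \mathbb{Z}.$$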



Proof. We will verify that

(2.9) $\|\widehat{\Delta_k}\|_\infty \lesssim 2^{-|k|}$, $\quad k \in \mathbb{Z}$,

(2.10) $\widehat{\Delta_k}$ is supported on $2^{-k-1} \le |\xi| \le 2^{-k+1}$,

(2.11) $|\Delta_k(y)| \lesssim 2^{-k-|k|}(1 + 2^{-k}|y|)^{-\nu}$, $\quad k \in \mathbb{Z},\ y \in \mathbb{R}$,

with implied constants independent of $k \in \mathbb{Z}$ and $\nu$ the large, unspecified constant that
appears in (2.3). With decay in $|k|$ in both (2.9) and (2.11), the lemma then follows from a
trivial change of scale and from Proposition 2.5.

Let us recall the trivial estimate which follows from the symmetry of $\zeta$,

(2.12) $$|\widehat{K_H}(\xi)| = \Bigl| \int \zeta(y) e^{i\xi y}\,\frac{dy}{y} \Bigr| \lesssim |\xi| .$$

In addition we have the estimate below, applied for $|\xi| < 1$,

(2.13) $$\Bigl| \int \zeta(y)\, y^{w-1} e^{i\xi y}\,dy \Bigr| \lesssim \begin{cases} |\xi| & w \text{ even}, \\ 1 & w \text{ odd}. \end{cases}$$

Whereas for $|\xi| > 1$, we have

(2.14) $$\Bigl| \frac{d^w}{d\xi^w} \widehat{K_H}(\xi) \Bigr| \lesssim |\xi|^{-\nu}, \qquad |\xi| > 1,\ 0 < w < \nu .$$

That is, we have very rapid decay in a large number of derivatives.

Now, (2.10) is true by definition of $\Delta_k$. To see (2.9) for $k > 2$, note that this is only
determined by the Fourier transform of $K_H$ since $\widehat{\Psi_0}$ and $\widehat{\overline{\Psi_0}}$ are zero. The result easily
follows by the inequality (2.12) and property (2.10). For $k < 2$, the inequality follows from
the construction of $D_0$, and in particular the property in (2.7).

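We turn to the last condition, (2.11). It is well known that decay of order $\nu$ in spatial
variables is implied by differentiability of a function in frequency variables. Observe that

$$(y^{\nu} \Delta_k(y))^{\wedge}(\xi) = c_\nu \frac{d^{\nu}}{d\xi^{\nu}} \widehat{\Delta_k}(\xi)$$

for a constant $c_\nu$ that depends only on $\nu$ and the normalization of the Fourier transform. Hence,

$$\bigl| (y^{\nu} \Delta_k(y))^{\wedge}(\xi) \bigr| \lesssim \sum_{w=0}^{\nu} 2^{kw} \sup_{2^{-k-1} < |\xi| < 2^{-k+1}} \Bigl| \frac{d^{\nu-w}}{d\xi^{\nu-w}} \widehat{K_H}(\xi) \Bigr| .$$

For $k \ge 1$, this sum is dominated by the last two terms. To control them, use (2.13),
supplying the estimate $\lesssim 2^{(\nu-1)k}$. This is better by a factor of $2^{-k}$ than the trivial estimate,
so that Fourier inversion proves (2.11) in this case.

The case of $k \le 0$ is easier, due to the rapid decay in (2.14).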



□ 



3. Decomposition and Main Proposition 

We state the definitions needed for the main proposition and conclude this section with 
the argument of how this proposition proves the results of the previous section, namely 
Proposition 2.4 and Proposition 2.5. 

In addition to the modulation and dilation operators in (1.1) and (1.2), we need translation
operators

(3.1) $\operatorname{Tran}_y f(x) = f(x-y)$, $\quad y \in \mathbb{R}$.

We set $\mathcal{D}$ to be the dyadic grid and say that $I \times \omega \in \mathcal{D} \times \mathcal{D}$ is a tile iff $|\omega| \cdot |I| = 1$. Let $\mathcal{T}$
denote the set of all tiles.

We think of $\omega$ as a frequency interval and $I$ as a spatial interval; our definition of a tile
is a reflection of the uncertainty principle for the Fourier transform. We will plot frequency
intervals in the vertical direction. Each dyadic interval $\omega$ is a union of two dyadic intervals
of half the length of $\omega$. We call them $\omega_\pm$ and view $\omega_+$ as above $\omega_-$.
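The following sketch models tiles with exact rational arithmetic: a dyadic interval is a pair of endpoints, a tile is a pair of dyadic intervals whose lengths multiply to one, and a frequency interval splits into its lower and upper halves. The function names and the half-open convention for intervals are assumptions of this example, not notation from the text.

```python
from fractions import Fraction

def dyadic(k, n):
    """The dyadic interval [n * 2**k, (n + 1) * 2**k) as a (left, right) pair."""
    length = Fraction(2) ** k
    return (n * length, (n + 1) * length)

def length(interval):
    """Length of an interval given as a (left, right) pair."""
    return interval[1] - interval[0]

def make_tile(k, n, m):
    """Tile I x omega with |I| = 2**k and |omega| = 2**(-k), so |I| * |omega| = 1."""
    return (dyadic(k, n), dyadic(-k, m))

def halves(omega):
    """Split a frequency interval into its lower half omega_- and upper half omega_+."""
    mid = (omega[0] + omega[1]) / 2
    return (omega[0], mid), (mid, omega[1])

tile = make_tile(3, 1, 5)                  # I = [8, 16), omega = [5/8, 6/8)
I, omega = tile
omega_minus, omega_plus = halves(omega)
```

Keeping the area fixed at one means that narrowing the spatial interval by a factor of two forces the frequency interval to double, which is the uncertainty principle constraint described above.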

We take a fixed Schwartz function $\varphi$ with frequency support in the interval $[-1/\nu, 1/\nu]$.
For a tile $s = I_s \times \omega_s$, define

(3.2) $\varphi_s = \operatorname{Mod}_{c(\omega_{s-})} \operatorname{Tran}_{c(I_s)} \operatorname{Dil}^{(2)}_{|I_s|} \varphi$.

Here, $c(J)$ is the center of the interval $J$, and $\omega_{s-}$ is the lower half of the interval $\omega_s$. Thus,
this function is localized to be supported in the time frequency plane close to the rectangle
$I_s \times \omega_{s-}$.

There are companion functions which depend on different choices of certain measurable
functions. These functions should be thought of as those choices of modulation and indices
that will achieve, up to a constant multiple, the suprema in the oscillation function. To
linearize the modulation, let

$N : \mathbb{R} \to \mathbb{R}$ be a measurable function (a modulation parameter).

We define another function related to the rectangle $I_s \times \omega_{s+}$ which tells us when the linearized
modulation parameter is at a certain frequency. Let

(3.3) $\phi_s(x) = \mathbf{1}_{\omega_{s+}}(N(x))\,\varphi_s(x)$.

Now define a tile variant of the oscillation operator by

(3.4) $$\text{Tile-osc}(f) \stackrel{\mathrm{def}}{=} \Bigl[ \sum_{j=1}^{\infty} \sup_{k_j \le l < l' \le k_{j+1}} \Bigl| \sum_{\substack{s \in \mathcal{T} \\ 2^{l} \le |I_s| < 2^{l'}}} \langle f, \varphi_s \rangle \phi_s \Bigr|^2 \Bigr]^{1/2} .$$

Here, an increasing sequence of integers $\{k_j : j \ge 1\}$ is specified in advance. We make the
definition for clarity's sake, as we will not explicitly work with it. Rather we prefer to fully
linearize this maximal operator. This requires the additional choices of functions

(3.5) $\alpha_j : \mathbb{R} \to \mathbb{R}$, $\quad \sum_{j=1}^{\infty} |\alpha_j(x)|^2 \le 1$ for all $x$,

(3.6) $\ell_{j-}, \ell_{j+} : \mathbb{R} \to \mathbb{Z}$, $\quad k_j \le \ell_{j-} < \ell_{j+} \le k_{j+1}$.

And we set

(3.7) $F_{s,j} \stackrel{\mathrm{def}}{=} \{x : 2^{\ell_{j-}(x)} \le |I_s| < 2^{\ell_{j+}(x)}\}$,

(3.8) $f_{s,j}(x) = \mathbf{1}_{F_{s,j}}(x)\,\alpha_j(x)\,\phi_s(x)$.

The sequences of functions $\ell_{j\pm}$ are selecting the level at which the maximal difference occurs.
The $\alpha_j$ are chosen to realize the $\ell^2$ norm in the definition of oscillation. We make all of these
choices in order to linearize the oscillation operator.

Our main proposition is 

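3.9. Proposition. For all choices of $N(x)$ and increasing sequences of integers $\{k_j\}$, the
operator Tile-osc extends to a bounded sublinear operator on $L^p$, $1 < p < \infty$. In particular,
for sets $G, H \subset \mathbb{R}$ of finite measure, we have

(3.10) $$\sum_{j \in \mathbb{N}} \sum_{s \in \mathcal{T}} |\langle \mathbf{1}_G, \varphi_s \rangle \langle \mathbf{1}_H, f_{s,j} \rangle| \lesssim \min(|G|, |H|) \Bigl( 1 + \Bigl| \log \frac{|G|}{|H|} \Bigr| \Bigr).$$

Note that the inequality above implies that

$$|\langle \text{Tile-osc}(\mathbf{1}_G), \mathbf{1}_H \rangle| \lesssim |G|^{1/p} |H|^{1-1/p}, \qquad 1 < p < \infty.$$

That is, we have the restricted weak type inequality for all $1 < p < \infty$.¹ Hence, an
interpolation argument will give us the estimate

(3.11) $$\|\text{Tile-osc}(f)\|_p \lesssim \|f\|_p, \qquad 1 < p < \infty.$$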
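The passage from the logarithmic bound in (3.10) to the restricted weak type bound is a one-variable calculus fact, since both sides are homogeneous of degree one in $(|G|, |H|)$. The sketch below checks it numerically on a logarithmic grid of ratios $|G|/|H|$ and compares the worst ratio with $\max(p, p/(p-1))$; that comparison constant comes from maximizing $(1+u)e^{-u/p}$ and $(1+u)e^{-u(1-1/p)}$ over $u \ge 0$ and is an observation of this example, not a statement taken from the text. All function names are hypothetical.

```python
import math

def log_bound(g, h):
    """Right-hand side of (3.10): min(|G|, |H|) * (1 + |log(|G| / |H|)|)."""
    return min(g, h) * (1.0 + abs(math.log(g / h)))

def power_bound(g, h, p):
    """Restricted weak type bound |G|^(1/p) * |H|^(1 - 1/p)."""
    return g ** (1.0 / p) * h ** (1.0 - 1.0 / p)

def worst_ratio(p, exponents=range(-400, 401)):
    """Largest value of log_bound / power_bound over |G| = e^(t/10), |H| = 1."""
    return max(
        log_bound(math.exp(t / 10.0), 1.0) / power_bound(math.exp(t / 10.0), 1.0, p)
        for t in exponents
    )

ratios = {p: worst_ratio(p) for p in (1.1, 1.5, 2.0, 4.0, 10.0)}
within_constant = all(ratios[p] <= max(p, p / (p - 1.0)) for p in ratios)
```

The ratio stays bounded for each fixed $p$ but grows as $p$ approaches $1$ or $\infty$, which is why the interpolation argument is needed to cover the whole range.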



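The Deduction of Proposition 2.4 and Proposition 2.5. For $\xi \in \mathbb{R}$ and $l \in \mathbb{Z}$, consider
the operators

$$A_{\xi,l} f = \sum_{\substack{s \in \mathcal{T} \\ |I_s| = 2^{l}}} \mathbf{1}_{\omega_{s+}}(\xi)\,\langle f, \varphi_s \rangle\,\varphi_s .$$

The tile oscillation operator is built up from these operators. Observe that these operators
enjoy the properties

(3.12) $A_{\xi,l} \operatorname{Tran}_{n2^{l}} = \operatorname{Tran}_{n2^{l}} A_{\xi,l}$, $\quad n \in \mathbb{Z}$,

(3.13) $A_{\xi,l} \operatorname{Dil}^{(2)}_{2^{l'}} = \operatorname{Dil}^{(2)}_{2^{l'}} A_{\xi 2^{l'},\, l-l'}$, $\quad l' \in \mathbb{Z}$,

(3.14) $A_{\xi,l} \operatorname{Mod}_{-\theta} = \operatorname{Mod}_{-\theta} A_{\xi+\theta,\, l}$, $\quad \theta \in \mathbb{R}$.

Notice that these conditions tell us that the operators $A_{\xi,l}$ have a near translation invariance,
a certain modulation invariance, and are related to each other through dilations. In addition,
these operators are bounded on $L^2$ uniformly in $\xi$ and $l$, a fact well represented in the
literature.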

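We will now define

$$B_{\xi,l} f \stackrel{\mathrm{def}}{=} \lim_{\substack{K \to \infty \\ L \to \infty}} \frac{1}{4KL} \int_{-K}^{K} \int_{-L}^{L} \operatorname{Mod}_{-\theta} \operatorname{Tran}_{-y} A_{(\xi+\theta),l} \operatorname{Tran}_{y} \operatorname{Mod}_{\theta} f \;dy\,d\theta .$$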



1 In fact, the estimate (3.10) gives a favorable upper bound on the behavior of the constant with respect 
to p, namely that they are no more than max(p, j^j)- See [12] 
By periodicity of the integrand in $y$ and $\theta$, for all Schwartz functions $f$, the averages on the
right hand side converge pointwise to $B_{\xi,l} f(x)$ as $K, L \to \infty$.

Let us make some observations about the operators $B_{\xi,l}$. First, (3.12) and periodicity of
the integrand in $y$ imply $B_{\xi,l}$ commutes with translations. Second, it is a bounded, positive,
semidefinite operator, as is easy to see. Hence, it is given by convolution. Indeed, (3.14)
implies that

$$B_{\xi,l} f = \operatorname{Mod}_{\xi} \bigl[ \beta_l * (\operatorname{Mod}_{-\xi} f) \bigr]$$

for a function $\beta_l$ that we turn to next. The equality (3.13) implies that $\beta_l = \operatorname{Dil}^{(1)}_{2^{l}} \beta_0$, where
$\beta_0$ is a smooth Schwartz function satisfying the conditions (2.1)-(2.3),
a routine exercise to verify.

Assuming Proposition 3.9, it follows that we can conclude Proposition 2.4 and
Proposition 2.5 for nonzero functions $\psi = \beta_0$. Our proof is complete.



4. Main Lemmas 

To prove (3.10), we split the sum over $s \in \mathcal{T}$ into the sum over $s$ such that $I_s \subset \{M\mathbf{1}_G > \lambda\}$
and the sum over $s$ such that $I_s \not\subset \{M\mathbf{1}_G > \lambda\}$. The former sum can be taken care of by an
argument of M. Lacey and C. Thiele [16] which also appears, slightly modified, in the paper
of L. Grafakos, T. Tao, and E. Terwilleger [12]. Thus we restrict our attention to the tiles $s$
where $I_s \not\subset \{M\mathbf{1}_G > \lambda\}$.

We begin with some concepts needed to phrase the proof. There is a natural partial order
on tiles. We say that $s < s'$ iff $\omega_s \supset \omega_{s'}$ and $I_s \subset I_{s'}$. Note that the time variable of $s$
is localized to that of $s'$, and the frequency variable of $s$ is similarly localized, up to the
variability allowed by the uncertainty principle. Note that two tiles are incomparable with
respect to the '<' partial order iff the tiles, as rectangles in the time frequency plane, do not
intersect. A "maximal tile" will be one that is maximal with respect to this partial order.
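The claim that two tiles are incomparable exactly when their rectangles are disjoint can be checked by brute force on a small family of tiles. The sketch below does this for all tiles with $|I| \in \{1/4, 1/2, 1, 2, 4\}$ inside the window $[0,4) \times [0,4)$; the window, the half-open interval convention, and the function names are assumptions of this example, and the order is taken to allow equality.

```python
from fractions import Fraction
from itertools import product

def dyadic(k, n):
    """The dyadic interval [n * 2**k, (n + 1) * 2**k) as a (left, right) pair."""
    length = Fraction(2) ** k
    return (n * length, (n + 1) * length)

def contains(big, small):
    """True when the half-open interval small lies inside big."""
    return big[0] <= small[0] and small[1] <= big[1]

def overlap(a, b):
    """True when two half-open intervals intersect."""
    return a[0] < b[1] and b[0] < a[1]

def leq(s, t):
    """Tile order: I_s inside I_t and omega_s containing omega_t (equality allowed)."""
    return contains(t[0], s[0]) and contains(s[1], t[1])

# Every tile I x omega with |I| in {1/4, ..., 4} inside the window [0, 4) x [0, 4).
tiles = [(dyadic(k, n), dyadic(-k, m))
         for k in range(-2, 3)
         for n in range(2 ** (2 - k))
         for m in range(2 ** (2 + k))]

comparable_iff_intersect = all(
    (leq(s, t) or leq(t, s)) == (overlap(s[0], t[0]) and overlap(s[1], t[1]))
    for s, t in product(tiles, repeat=2)
)
```

The check succeeds because intersecting dyadic intervals are nested, and the area-one constraint forces the nesting in frequency to run opposite to the nesting in time.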

Let $S$ denote an arbitrary set of tiles. We call a set of tiles $T \subset S$ a tree if there is a
tile $I_T \times \omega_T$, called the top of the tree, such that for all $s \in T$, $s < I_T \times \omega_T$. We note that
the top is not uniquely defined. An important point is that a tree top specifies a location in
time variable for the tiles in the tree, namely inside $I_T$, and localizes the frequency variables,
identifying $\omega_T$ as a nominal origin.

We say that the count of $S$ is at most $A$ iff $S = \bigcup_{T \subset S} T$, where each $T \subset S$ is a tree
which is maximal with respect to inclusion and

$$\operatorname{Count}(S) = \sum_{T \subset S} |I_T| < A .$$

Fix $\chi(x) = (1 + |x|)^{-\nu}$ where $\nu$ is, as before, a large constant whose exact value is
unimportant to us. Define

(4.1) $\chi_I := \operatorname{Tran}_{c(I)} \operatorname{Dil}^{(1)}_{|I|} \chi$,

(4.2) $\operatorname{dense}(s) := \sup_{s < s'} \int_{N^{-1}(\omega_{s'}) \cap H} \chi_{I_{s'}}\,dx$,

$\operatorname{dense}(S) := \sup_{s \in S} \operatorname{dense}(s)$, $\quad S \subset \mathcal{T}$.

The first and most natural definition of a "density" of a tile would be $|I_s|^{-1} |N^{-1}(\omega_{s+}) \cap I_s|$.
However $\varphi$ is supported on the whole real line, although it does decay faster than the inverse
of any polynomial. We refer to this as a "Schwartz tails problem." The definition of density
as $\int_{N^{-1}(\omega_s) \cap H} \chi_{I_s}\,dx$, as it turns out, is still not adequate. That we should take the supremum
over $s < s'$ only becomes evident in the proof of the "Tree Lemma" below.

The "Density Lemma" is 
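4.3. Lemma. Any subset $S \subset \mathcal{T}$ is a union of $S_{\mathrm{heavy}}$ and $S_{\mathrm{light}}$ for which

$$\operatorname{dense}(S_{\mathrm{light}}) < \tfrac12 \operatorname{dense}(S),$$

and the collection $S_{\mathrm{heavy}}$ satisfies

(4.4) $$\operatorname{Count}(S_{\mathrm{heavy}}) \lesssim \operatorname{dense}(S)^{-1} |H| .$$

What is significant is that this relatively simple lemma admits a non-trivial variant
intimately linked to the tree structure and orthogonality. We should refine the notion of a tree.
Call a tree $T$ with top $I_T \times \omega_T$ a $\pm$tree iff for each $s \in T$, aside from the top, $I_T \times \omega_T \cap I_s \times \omega_{s\pm}$
is not empty. Any tree is a union of a $+$tree and a $-$tree. If $T$ is a $+$tree, observe that the
rectangles $\{I_s \times \omega_{s-} : s \in T\}$ are disjoint. We see that

$$\sum_{s \in T} |\langle f, \varphi_s \rangle|^2 \lesssim \|f\|_2^2 .$$

This motivates the definition

(4.5) $$\operatorname{size}(S) := \sup \Bigl\{ |I_T|^{-1/2} \Bigl[ \sum_{s \in T} |\langle f, \varphi_s \rangle|^2 \Bigr]^{1/2} : T \subset S,\ T \text{ is a } {+}\text{tree} \Bigr\} .$$

The "Size Lemma" is

4.6. Lemma. Assume that $f = \mathbf{1}_G$. Any subset $S \subset \mathcal{T}$ is a union of $S_{\mathrm{big}}$ and $S_{\mathrm{small}}$ for which

$$\operatorname{size}(S_{\mathrm{small}}) < \tfrac12 \operatorname{size}(S),$$

and the collection $S_{\mathrm{big}}$ satisfies

(4.7) $$\operatorname{Count}(S_{\mathrm{big}}) \lesssim \operatorname{size}(S)^{-2} |G| .$$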
Concerning the quantity size, we need an additional piece of information about it. Recall
that $M$ is the Hardy-Littlewood maximal function.

4.8. Lemma. Let $0 < \lambda < 1$, and suppose that $S$ is the set of tiles with

$$I_s \not\subset \{M\mathbf{1}_G > \lambda\}, \qquad s \in S .$$

Then it is the case that $\operatorname{size}(S) \lesssim \lambda$.

This fact, a delicate consequence of the Calderón-Zygmund decomposition, will not be
proved in this paper. It, like the Size Lemma and the Density Lemma, is already well 
represented in the literature. See, for example, [12]. For proofs of the Density and Size 
Lemmas, we refer the reader to [17]. The survey [15] is also suggested. 

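For a set of tiles $S$, set

$$\operatorname{Sum}(S) := \sum_{j \in \mathbb{N}} \sum_{s \in S} |\langle \mathbf{1}_G, \varphi_s \rangle \langle \mathbf{1}_H, f_{s,j} \rangle| .$$

Our final lemma relates trees, density and size. It is the "Tree Lemma."

4.9. Lemma. For any tree $T$

(4.10) $$\operatorname{Sum}(T) \lesssim \operatorname{size}(T) \operatorname{dense}(T) |I_T| .$$

Of course for any set of tiles $S$, we would then have

$$\operatorname{Sum}(S) \lesssim \sum_{T \subset S} \operatorname{size}(T) \operatorname{dense}(T) |I_T| .$$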

Thus, we should inductively apply Lemma 4.3 and Lemma 4.6 so that the 'Count' estimates 
are essentially equal. The formal proof of Proposition 3.9, which is much as it appears in 
Lacey and Thiele [17] with the adaptation to a restricted weak type inequality as seen in 
[12], is left as an exercise for the reader. 

5. Proof of Lemma 4.9 

The tree lemma, with its adaptation to the setting of oscillation, is the primary new step 
in this paper. 

We begin with some remarks about oscillation operators, and a particular form of the same
that we shall use at a critical point of this proof. Let $\zeta$ be a smooth function with Fourier
transform supported in $[-1-\epsilon, 1+\epsilon]$ for a fixed, small, positive $\epsilon$ and equal to $1$ on $[-1, 1]$.
Set

$$\operatorname{osc}(f)^2 \stackrel{\mathrm{def}}{=} \sum_{j=1}^{\infty} \sup_{2^{k_j} \le |I| < |I'| \le 2^{k_{j+1}}} \bigl| \operatorname{Dil}^{(1)}_{|I|} \zeta * f - \operatorname{Dil}^{(1)}_{|I'|} \zeta * f \bigr|^2 .$$
It is known that this is bounded on $L^2$, and in this situation we will give an elementary proof
of this fact below.

We shall have recourse to not only this bound, but a particular refinement. Let $\mathcal{J}$ be
a partition of $\mathbb{R}$ into dyadic intervals. To each $J \in \mathcal{J}$, associate a subset $E(J) \subset J$ with
$|E(J)| \le \delta |J|$, where $0 < \delta < 1$ is fixed. Consider

(5.1) $$\operatorname{Osc}_\delta(f)^2 \stackrel{\mathrm{def}}{=} \sum_{J \in \mathcal{J}} \mathbf{1}_{E(J)} \sum_{j=1}^{\infty} \sup_{\substack{J \subset I \subset I' \\ 2^{k_j} \le |I| < |I'| \le 2^{k_{j+1}}}} \bigl| \mathbf{1}_I \langle \zeta_I, f \rangle - \mathbf{1}_{I'} \langle \zeta_{I'}, f \rangle \bigr|^2 .$$

We estimate the norm of this operator.

5.2. Lemma. We have the estimate

(5.3) $$\|\operatorname{Osc}_\delta(f)\|_2 \lesssim \sqrt{\delta}\,\|f\|_2$$

for all $f \in L^2$.



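Proof. Let us begin with a proof that $\|\operatorname{osc}\|_{2 \to 2} \lesssim 1$. That is, we do not have the additional
information about the partition $\mathcal{J}$, and sets $E(J)$ for $J \in \mathcal{J}$. For a sequence of increasing
integers $k_j$ and function $f \in L^2$, set

$$\widehat{f_j} = \mathbf{1}_{2^{-k_{j+1}-1} < |\xi| < 2^{-k_j+1}}\,\widehat{f} .$$

Then, we certainly have $\sum_{j \in \mathbb{N}} \|f_j\|_2^2 \lesssim \|f\|_2^2$. Moreover, due to our assumption about the
function $\zeta$,

$$\sup_{2^{k_j} \le |I| < |I'| \le 2^{k_{j+1}}} \bigl| \operatorname{Dil}^{(1)}_{|I|} \zeta * f - \operatorname{Dil}^{(1)}_{|I'|} \zeta * f \bigr| \lesssim M f_{j-1} + M f_j + M f_{j+1},$$

where $M$ is the usual maximal function. Thus, by the boundedness of the maximal function
on $L^2$ we have

$$\|\operatorname{osc}(f)\|_2^2 \lesssim \sum_j \|M f_j\|_2^2 \lesssim \|f\|_2^2 .$$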



It is hardly surprising that the proof above appeals to the boundedness of the maximal 
function, as the estimate on the oscillation operator implies that for the maximal function. 
Likewise, our lemma implies a bound for a certain variant of the maximal function. As it 
turns out, we need this variant in the course of the proof. 

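Define

$$M_\delta f(x) \stackrel{\mathrm{def}}{=} \sum_{J \in \mathcal{J}} \mathbf{1}_{E(J)}(x) \sup_{J \subset I} \langle |f|, \chi_I \rangle$$

where $\chi_I$ is defined as in (4.1). Then the estimate we claim is $\|M_\delta\|_2 \lesssim \sqrt{\delta}$.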
Indeed, for any point $x \in E(J)$, we have the inequality

$$M_\delta f(x) \lesssim \inf_{y \in J} M f(y),$$

where $M$ is the usual maximal function. Therefore, we can estimate

$$\|M_\delta f\|_2^2 = \sum_{J \in \mathcal{J}} \int_{E(J)} M_\delta f(x)^2\,dx \lesssim \sum_{J \in \mathcal{J}} |E(J)| \inf_{y \in J} M f(y)^2 \le \delta \int M f(x)^2\,dx .$$

This proves our claim.

To conclude the proof, we can estimate

$$\int \operatorname{Osc}_\delta(f)(x)^2\,dx \lesssim \sum_j \|M_\delta f_j\|_2^2 \lesssim \delta \sum_j \|f_j\|_2^2 \lesssim \delta \|f\|_2^2 .$$

Our proof is complete.



□ 



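We begin the main line of the argument. Let $\delta = \operatorname{dense}(T)$, and $\sigma = \operatorname{size}(T)$. By a
modification of the functions $\alpha_j(x)$ by a choice of signs, we can assume the identity

$$\sum_{j \in \mathbb{N}} \sum_{s \in T} |\langle \mathbf{1}_G, \varphi_s \rangle \langle \mathbf{1}_H, f_{s,j} \rangle| = \int_H \sum_{j \in \mathbb{N}} \sum_{s \in T} \langle \mathbf{1}_G, \varphi_s \rangle f_{s,j}(x)\,dx .$$

As we have no particular control on the set $H$, we will need the following partition of the
real line induced by the tree $T$. Let $\mathcal{J}$ be the partition of $\mathbb{R}$ consisting of the maximal dyadic
intervals $J$ such that $3J$ does not contain any $I_s$ for $s \in T$. It is helpful to observe that for
such $J$, if $|J| < |I_T|$, then $J \subset 3I_T$, and if $|J| \ge |I_T|$, then $\operatorname{dist}(J, I_T) \gtrsim |J|$. The integral
above is at most the sum of the two terms below.

(5.4) $$\sum_{j \in \mathbb{N}} \sum_{J \in \mathcal{J}} \sum_{\substack{s \in T \\ |I_s| \le 2|J|}} |\langle \mathbf{1}_G, \varphi_s \rangle| \int_{J \cap H} |f_{s,j}(x)|\,dx,$$

(5.5) $$\sum_{j \in \mathbb{N}} \sum_{J \in \mathcal{J}} \int_{J \cap H} \Bigl| \sum_{\substack{s \in T \\ |I_s| > 2|J|}} \langle \mathbf{1}_G, \varphi_s \rangle f_{s,j}(x) \Bigr|\,dx .$$

Notice that for the second sum to be non-zero, we must have $J \subset 3I_T$.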
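The partition induced by a tree can be computed by a simple recursive subdivision, sketched below for illustration. The sketch restricts attention to one dyadic window `root` (the true partition of the whole line may contain larger intervals far from the tree) and represents the tree only through the list of its spatial intervals; the function names, the window, and the example tree are assumptions of this example.

```python
from fractions import Fraction

def triple(J):
    """The interval 3J: same center as J, three times the length."""
    length = J[1] - J[0]
    return (J[0] - length, J[1] + length)

def contains(big, small):
    """True when the interval small lies inside big."""
    return big[0] <= small[0] and small[1] <= big[1]

def partition(root, tree_intervals):
    """Maximal dyadic J inside root such that 3J contains no interval of the tree."""
    if not any(contains(triple(root), I) for I in tree_intervals):
        return [root]
    # 3*root still contains some I_s, so root is too large: split it in two.
    mid = (root[0] + root[1]) / 2
    return (partition((root[0], mid), tree_intervals)
            + partition((mid, root[1]), tree_intervals))

root = (Fraction(0), Fraction(8))            # dyadic window (an assumption)
tree = [(Fraction(0), Fraction(1))]          # a one-tile tree with I_T = [0, 1)
pieces = partition(root, tree)
```

The recursion stops because once $|J|$ drops below a third of the shortest $I_s$, the interval $3J$ is too short to contain any $I_s$. In the example the pieces shrink near $I_T$ and grow with the distance from it, in line with the observation that $\operatorname{dist}(J, I_T) \gtrsim |J|$ for the large intervals.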

The first term (5.4) is controlled by an appeal to the "Schwartz tails." Fix an integer
$n \ge -1$, and only consider those $s \in T$ for which $|I_s| = 2^{-n} |J|$. Recalling that
$f_{s,j}(x) = \mathbf{1}_{F_{s,j}}(x)\,\alpha_j(x)\,\mathbf{1}_{\omega_{s+}}(N(x))\,\varphi_s(x)$, we see that

$$\sum_{j \in \mathbb{N}} \sum_{\substack{s \in T \\ |I_s| = 2^{-n}|J|}} |\langle \mathbf{1}_G, \varphi_s \rangle| \int_{J \cap H} |f_{s,j}(x)|\,dx \lesssim \sum_{\substack{s \in T \\ |I_s| = 2^{-n}|J|}} \sigma\delta\,\bigl(|I_s|^{-1} \operatorname{dist}(I_s, J)\bigr)^{-10} |I_s|$$

$$\lesssim \sigma\delta\,2^{-n} \min\bigl(|J|,\ |J| (|I_T|^{-1} \operatorname{dist}(J, I_T))^{-5}\bigr).$$

Observe that for each $s$ above, only one value of $j$ contributes to the left hand sum. In
addition, we have used the fact that there are only a bounded number of tiles $s$ for which
$|I_s|^{-1} \operatorname{dist}(I_s, J)$ is essentially constant. In addition, for the case $|J| \le |I_T|$, we used that
the distance from $I_s$ to $J$ is at least $\gtrsim |J|$. In the case $|J| \ge |I_T|$, use
$|I_s|^{-1} \operatorname{dist}(I_s, J) \ge |I_T|^{-1} \operatorname{dist}(J, I_T)$. The estimate above can then be summed over $n \ge -1$ and $J \in \mathcal{J}$ to bound
(5.4) by $\lesssim \sigma\delta |I_T|$, as required.

Now we turn to the control of (5.5). The integral in this quantity is supported in the set

(5.6) $$E(J) = J \cap \bigcup_{\substack{s \in T \\ |I_s| > 2|J|}} \bigl( N^{-1}(\omega_{s+}) \cap H \bigr).$$

Then the critical observation is that $|E(J)| \lesssim \delta |J|$. To see this, let $J'$ be the next larger
dyadic interval that contains $J$. Then $3J'$ must contain some $I_{s'}$ for $s' \in T$. Then there
exists a tile $s''$ with $I_{s'} \subset I_{s''} \subset I_T$ such that $|I_{s''}|$ is $2|J|$ or $4|J|$, and $\omega_T \subset \omega_{s''} \subset \omega_{s'}$. Then,
$s' < s''$, and by the definition of density,

$$\frac{|J \cap H \cap N^{-1}(\omega_{s''})|}{|I_{s''}|} \lesssim \int_{H \cap N^{-1}(\omega_{s''})} \chi_{I_{s''}}\,dx \le \delta .$$

But, for each $s$ as in (5.6), we have $\omega_s \subset \omega_{s''}$, so that $E(J) \subset N^{-1}(\omega_{s''})$. Our claim follows.

Suppose that $T$ is a $-$tree. This means that the tiles $\{I_s \times \omega_{s+} : s \in T\}$ are disjoint and
thus the functions $f_{s,j}$ are disjointly supported. In particular, the oscillation that arises from
such functions is trivially bounded by their $\ell^\infty$ norm. Then the bound for (5.5) is no more
than

$$\sum_{j \in \mathbb{N}} |E(J)|\,\Bigl\| \sum_{\substack{s \in T \\ |I_s| > 2|J|}} \langle \mathbf{1}_G, \varphi_s \rangle f_{s,j} \Bigr\|_\infty \lesssim \sigma\delta |J| .$$

This is summed over $J \subset 3I_T$ to get the desired bound.
Suppose that $T$ is a $+$tree. This is the interesting case. At this point, we will appeal to
the norm bound for oscillation, (5.3), applied to the function

$$\Gamma \stackrel{\mathrm{def}}{=} \operatorname{Mod}_{-c(\omega_T)} \sum_{s \in T} \langle \mathbf{1}_G, \varphi_s \rangle \varphi_s .$$

This is an assumption that can be made after an appropriate modulation of the fixed $L^2$
function $f$. In the definition of $\Gamma$, it is useful to us that we only use the "smooth" functions
$\varphi_s$ in the definition of this function. Note that $\|\Gamma\|_2 \lesssim \sigma \sqrt{|I_T|}$, which is a consequence of
the definition of size and the (near) orthogonality of the functions $\varphi_s$ in the case of a $+$tree.

The purpose of these next remarks is to relate the sums over a $+$tree to oscillation. Recall
that the oscillation is defined relative to a sequence of integers $k_j$. For each $J$, consider $x \in J$
and integers $\ell$ such that $\max(2|J|, 2^{\ell_{j-}(x)}) \le 2^{\ell} < 2^{\ell_{j+}(x)}$. We have

$$\sum_{\substack{s \in T \\ |I_s| = 2^{\ell}}} \langle \mathbf{1}_G, \varphi_s \rangle f_{s,j}(x) = \sum_{\substack{s \in T \\ |I_s| = 2^{\ell}}} \langle \mathbf{1}_G, \varphi_s \rangle \varphi_s(x)\,\alpha_j(x) .$$

This is because all of the intervals $\omega_{s+}$ are nested and must contain $\omega_T$, and if $N(x) \in \omega_{s+}$,
then it must also be in every other $\omega_{s'+}$ that is the same size or larger. What is significant
here is that on the right we have a particular scale of (a modulation of) the sum that defines
$\Gamma$.



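Furthermore, consider the functions

$$\Gamma_{j,J}(x) = \operatorname{Mod}_{-c(\omega_T)} \sum_{\substack{s \in T \\ \max(2|J|,\,2^{\ell_{j-}(x)}) \le |I_s| < 2^{\ell_{j+}(x)}}} \langle \mathbf{1}_G, \varphi_s \rangle \varphi_s .$$

In particular, we can choose $\zeta$ as in the definition of our oscillation operator (5.1) so that

$$\operatorname{Dil}^{(1)}_{2^{l}} \zeta * \Gamma = \operatorname{Mod}_{-c(\omega_T)} \sum_{\substack{s \in T \\ |I_s| \ge 2^{l}}} \langle \mathbf{1}_G, \varphi_s \rangle \varphi_s .$$

Therefore, we have

$$\Gamma_{j,J} = \Bigl[ \operatorname{Dil}^{(1)}_{\max(2|J|,\,2^{\ell_{j-}(x)})} \zeta - \operatorname{Dil}^{(1)}_{2^{\ell_{j+}(x)}} \zeta \Bigr] * \Gamma .$$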



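We conclude that for $x \in E(J)$,

$$\Bigl| \sum_{j=1}^{\infty} \sum_{\substack{s \in T \\ 2|J| < |I_s|}} \langle \mathbf{1}_G, \varphi_s \rangle f_{s,j}(x) \Bigr| \le \Bigl[ \sum_{j=1}^{\infty} |\alpha_j(x)|^2 \Bigr]^{1/2} \Bigl[ \sum_{j=1}^{\infty} |\Gamma_{j,J}(x)|^2 \Bigr]^{1/2} \le \operatorname{Osc}_\delta \Gamma(x),$$

where we are using the oscillation operator defined in (5.1). We are able to use this operator
here since $2|J| < |I_s|$ and $3J$ does not contain any $I_s$, which implies that $J \subset 3I_s$.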



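The conclusion of this proof is now at hand. By Lemma 5.2 we have

$$\begin{aligned}
\sum_{\substack{J \in \mathcal{J} \\ |J| \le 3|I_T|}} \int_{E(J)} \Bigl| \sum_{j=1}^{\infty} \sum_{\substack{s \in T \\ 2|J| < |I_s|}} \langle \mathbf{1}_G, \varphi_s \rangle f_{s,j}(x) \Bigr|\,dx
&\le \sum_{\substack{J \in \mathcal{J} \\ |J| \le 3|I_T|}} \int_{E(J)} |\operatorname{Osc}_\delta \Gamma(x)|\,dx \\
&\le \Bigl| \bigcup_{|J| \le 3|I_T|} E(J) \Bigr|^{1/2} \|\operatorname{Osc}_\delta \Gamma\|_2 \\
&\lesssim \delta \sqrt{|I_T|}\,\|\Gamma\|_2 \\
&\lesssim \sigma\delta |I_T| .
\end{aligned}$$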



6. Concluding Remarks 

Let us pose a conjecture concerning the kernel $J_H(y) := y^{-1}\mathbf{1}_{[-1,1]}(y)$, that is the Hilbert
transform kernel with a sharp cut off.

6.1. Conjecture. We have the inequality valid for all $n \ge 1$

$$\|\operatorname{Osc}_n(J_H; f)\|_p \lesssim \|f\|_p, \qquad 1 < p < \infty.$$

In fact, the implied constant can be taken independent of $n$.

The proof as currently presented doesn't permit the deduction of this. Given the central
role the Fourier transform plays in our proof, the technical difficulty we come to has a succinct
description in terms of $\widehat{J_H}$. Namely, the variation of $\widehat{J_H}$ is infinite. But as the variation is
only logarithmically infinite, one suspects that a proof of the conjecture above would have
to revisit the proof of Carleson's theorem, with this example in mind.

6.2. Corollary. For any measure preserving system $(X, \mu, T)$ and $f \in L^p(X, \mu)$ for $1 < p < \infty$,
there is a set $X_f$ of probability one for which for all $x \in X_f$

$$\lim_{N \to \infty} \sum_{0 < |k| < N} e^{i\theta k}\,\frac{f(T^k x)}{k} \quad \text{exists for all } \theta.$$



The proof would begin by transferring the oscillation inequality in Theorem 1.4, valid on 
R, to the integers Z. This kind of transference can be done directly; it is also possible that 
the necessary result follows from known transference results such as Auscher and Carro [5]. 
Details are left to the reader. 
Likewise, the method of proof that we employ throughout the paper could be adapted
to shed light on more general singular integrals, as well as the original Wiener-Wintner 
Theorem. Indeed, an oscillation result could be proved for the latter theorem. We do not 
however pursue these lines here. 

The Wiener-Wintner Theorem has a deep extension to the Return Time Theorem of 
Bourgain [6], see also the appendix to [7]. This Theorem, which we don't recall in detail here, 
has certain extensions and variants that are currently only approachable via the phase plane 
methods of the type used in this paper. The Return Time is however a more sophisticated 
result, and the phase plane methods required are correspondingly more difficult. These issues 
will be explored in forthcoming papers of C. Demeter, M. Lacey, T. Tao, and C. Thiele. 